Conversation

@ownia
Contributor

@ownia ownia commented Jan 28, 2026

I encountered an error while testing #18825 (on the latest master branch):

(.venv) ➜  llama.cpp git:(master) ✗ ./build/bin/llama-cli -m PaddleOCR-VL-GGUF.gguf \
  --mmproj PaddleOCR-VL-GGUF-mmproj.gguf \
  --color on \
  --image test.jpg \
  --prompt "OCR:" \
  --reasoning-budget 0
ggml_metal_device_init: tensor API disabled for pre-M5 and pre-A19 devices
ggml_metal_library_init: using embedded metal library
ggml_metal_library_init: loaded in 0.010 sec
ggml_metal_rsets_init: creating a residency set collection (keep_alive = 180 s)
ggml_metal_device_init: GPU name:   Apple M1
ggml_metal_device_init: GPU family: MTLGPUFamilyApple7  (1007)
ggml_metal_device_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_device_init: GPU family: MTLGPUFamilyMetal4  (5002)
ggml_metal_device_init: simdgroup reduction   = true
ggml_metal_device_init: simdgroup matrix mul. = true
ggml_metal_device_init: has unified memory    = true
ggml_metal_device_init: has bfloat            = true
ggml_metal_device_init: has tensor            = false
ggml_metal_device_init: use residency sets    = true
ggml_metal_device_init: use shared buffers    = true
ggml_metal_device_init: recommendedMaxWorkingSetSize  = 12713.12 MB

Loading model...


▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
                                    ██    ██
                                    ▀▀    ▀▀

build      : b7852-eef375ce1
model      : PaddleOCR-VL-GGUF.gguf
modalities : text, vision

available commands:
  /exit or Ctrl+C     stop or exit
  /regen              regenerate the last response
  /clear              clear the chat history
  /read               add a text file
  /image <file>       add an image file

Loaded media from 'test.jpg'

> OCR:

WARNING: Using native backtrace. Set GGML_BACKTRACE_LLDB for more info.
WARNING: GGML_BACKTRACE_LLDB may cause native MacOS Terminal.app to crash.
See: https://github.com/ggml-org/llama.cpp/pull/17869
0   libggml-base.0.9.5.dylib            0x000000010088136c ggml_print_backtrace + 276
1   libggml-base.0.9.5.dylib            0x0000000100895ec8 _ZL23ggml_uncaught_exceptionv + 12
2   libc++abi.dylib                     0x000000018f594c2c _ZSt11__terminatePFvvE + 16
3   libc++abi.dylib                     0x000000018f598648 __cxa_increment_exception_refcount + 0
4   llama-cli                           0x00000001002e12d0 _ZN5jinja9statement7executeERNS_7contextE + 140
5   llama-cli                           0x000000010023862c _ZN5jinja7runtime7executeERKNS_7programE + 172
6   llama-cli                           0x0000000100237cf4 _ZL5applyRK20common_chat_templateRK16templates_paramsRKNSt3__18optionalIN8nlohmann16json_abi_v3_12_010basic_jsonINS8_11ordered_mapENS5_6vectorENS5_12basic_stringIcNS5_11char_traitsIcEENS5_9allocatorIcEEEEbxydSF_NS8_14adl_serializerENSB_IhNSF_IhEEEEvEEEESO_SO_ + 2300
7   llama-cli                           0x0000000100236e80 _ZL37common_chat_params_init_without_toolsRK20common_chat_templateRK16templates_params + 112
8   llama-cli                           0x000000010022b47c _ZL33common_chat_templates_apply_jinjaPK21common_chat_templatesRK28common_chat_templates_inputs + 18976
9   llama-cli                           0x000000010010cd48 _ZN11cli_context11format_chatEv + 308
10  llama-cli                           0x0000000100106454 _ZN11cli_context19generate_completionER14result_timings + 80
11  llama-cli                           0x00000001001051f8 main + 4912
12  dyld                                0x000000018f219d54 start + 7184
libc++abi: terminating due to uncaught exception of type jinja::rethrown_exception:
------------
While executing For at line 14, column 13 in source:
...%}↵        {{- "User: " -}}↵        {%- for content in message["content"] -%}↵  ...
                                           ^
Error: Expected iterable or object type in for loop: got String
[1]    53856 abort      ./build/bin/llama-cli -m PaddleOCR-VL-GGUF.gguf --mmproj  --color on --image

The original chat template is at https://huggingface.co/PaddlePaddle/PaddleOCR-VL/blob/main/chat_template.jinja
I use git bisect to find this commit (6df686b) introduced a change which cleaned up the chatml fallback and it is the first bad commit. I think the root cause is that jinja templates will fail when they receive plain strings during the content array indexing (for llama-cli case). So this PR will detect templates that expect typed/array-style message content and convert string contents into a typed-content array if requires_typed_content.

@ownia ownia changed the title common : convert string contents to arrays if template requires typed… common : convert string contents to arrays if template requires typed content Jan 28, 2026
Collaborator

@ngxson ngxson left a comment


This may not be a good fix. Formatting can fail for multiple reasons, so assuming it fails due to typed content will break other templates.

Instead, we should have a system to enable a capability, then retry the formatting to verify whether it actually works. I'll have a look at that later.
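
One possible shape for that retry, as a rough sketch: `format` is a hypothetical stand-in for whatever renders the chat template and throws on failure, not an actual API.

#include <functional>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

using json = nlohmann::ordered_json;

// Illustrative only: first try the messages as given; only if formatting
// throws, wrap string contents into typed arrays and retry once.
static std::string apply_with_retry(
        const std::function<std::string(const std::vector<json> &)> & format,
        std::vector<json> messages) {
    try {
        return format(messages);
    } catch (const std::exception &) {
        for (auto & msg : messages) {
            if (msg.contains("content") && msg["content"].is_string()) {
                const std::string text = msg["content"].get<std::string>();
                msg["content"] = json::array({
                    { {"type", "text"}, {"text", text} },
                });
            }
        }
        return format(messages); // retry; rethrows if it still fails
    }
}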

@ownia
Contributor Author

ownia commented Jan 28, 2026

Yeah, this is a quick fix that I've validated locally; we should be cautious about template formatting. Feel free to modify my patch.

@CISC
Collaborator

CISC commented Jan 28, 2026

Funnily enough, it's not illegal to pass a string to a for loop in jinja2 (since strings are both sequences and iterables); it just doesn't give any reasonable output. I think I prefer failing the way we do, as doing so is most certainly a bug.

Edit: reasonable output, as in it yields every single character of the string, which hopefully no chat template will use or expect to work.

@ownia
Contributor Author

ownia commented Jan 28, 2026

Yes, I think the current error message is reasonable. I'm just thinking about how to make llama-cli more flexible (and stable) in accepting templates, rather than attempting to update the original template.

@github-actions github-actions bot added the jinja parser Issues related to the jinja parser label Jan 28, 2026
@pwilkin
Collaborator

pwilkin commented Jan 28, 2026

It's a templating problem. I'm working on it in the autoparser branch. Basically, for some templates we need to vary whether we treat an unquoted value in a field as a string or as a JSON object based on the schema for the tool call, i.e.

<arg name=baz>
foo
</arg>

is a string when baz is typed as a string and

<arg name=baz>
["foo"]
</arg>

should be a JSON array (not the literal string ["foo"]) when baz is typed as an array.
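
Sketched out, with illustrative names only (this is not the actual autoparser code), the schema-driven decision could look like:

#include <string>
#include <nlohmann/json.hpp>

using json = nlohmann::ordered_json;

// Illustrative only: interpret a raw <arg> value according to the declared
// type in the tool's JSON schema for that argument.
static json parse_arg_value(const std::string & raw, const json & schema) {
    if (schema.value("type", "string") == "string") {
        return raw;              // <arg>foo</arg> stays the string "foo"
    }
    return json::parse(raw);     // <arg>["foo"]</arg> becomes a JSON array
}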

@pwilkin
Collaborator

pwilkin commented Jan 28, 2026

Oh, sorry, I misread; this is about converting content, not tool call args. Still, the general problem is the same: we need to be able to determine which one we should parse. The key thing is that when streaming, you cannot just "change your mind" partway through, because the delta would become invalid, so you have to know in advance whether the content you'll be getting is an array or a string.

